Proteins: Structure, Function, and Bioinformatics

Wiley

Preprints posted in the last 30 days, ranked by how well they match Proteins: Structure, Function, and Bioinformatics's content profile, based on 82 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.

1
Structural analysis of Helicobacter pylori glutamate racemase in a monoclinic crystal form

Spiliopoulou, M.; Schulz, E. C.

2026-04-03 biochemistry 10.64898/2026.04.02.716094 medRxiv
Top 0.1%
14.4%

Glutamate racemase (MurI) catalyzes the stereochemical interconversion of L-glutamate to D-glutamate, a key element of bacterial peptidoglycan biosynthesis. In this study, we present the crystal structure of Helicobacter pylori glutamate racemase at 1.43 Å resolution, in the same monoclinic symmetry as previously reported models but with different unit-cell parameters. The present model contains a single dimer and retains the previously described head-to-head dimer arrangement. The differences between the models arise from variations in unit-cell parameters, which lead to altered crystal packing interactions rather than changes in the quaternary assembly. The monomeric fold and active-site architecture remain conserved and are consistent with the catalytic features described for bacterial glutamate racemases. This structure provides an updated, high-resolution structural model for H. pylori glutamate racemase and highlights the variability in crystal packing within the same space group.

2
A conserved isoleucine gates the diffusion of small ligands to the active site of NiFe CO-dehydrogenase

Opdam, L.; Meneghello, M.; Guendon, C.; Chargelegue, J.; Fasano, A.; Jacq-Bailly, A.; Leger, C.; Fourmond, V.

2026-03-21 biochemistry 10.64898/2026.03.19.713016 medRxiv
Top 0.1%
7.3%

CO dehydrogenases (CODH) are metalloenzymes that reversibly oxidize CO to CO2 at a buried NiFe4S4 active site. The substrates, CO and CO2, therefore need to be transported through the protein matrix to reach the active site. The most likely pathway for intra-protein diffusion is the hydrophobic channel identified in the crystal structures. Here, we use site-directed mutagenesis to study the highly conserved isoleucine 563 of Thermococcus sp. AM4 CODH2. Mutations at this position change the biochemical properties (KM for CO, product inhibition constant, catalytic bias...), and increase the resistance of the enzyme to the inhibitor O2, showing that isoleucine 563 indeed lines the gas channel. The I563F mutation decreases the bimolecular rate constant of inhibition by O2 15-fold, and increases the IC50 20-fold, which is the strongest improvement in O2 resistance reported so far. We show that the size of the introduced amino acids is less important than their flexibility, along with the size of the cavity formed near the active site in the channel. We also conclude that O2 access to the active site cannot be slowed down without also affecting CO diffusion. This tradeoff will have to be considered in further attempts to use site-directed mutagenesis to make CODHs more O2 tolerant.

3
Impact of the MX segment on the biogenesis of α7 nACh receptors

Do, Q. H.; Kim Cavdar, I.; Grozdanov, P.; Theriot, J. J.; Ramani, R.; Jansen, M.

2026-04-06 neuroscience 10.64898/2026.04.02.715926 medRxiv
Top 0.1%
4.4%

Nicotinic acetylcholine receptors (nAChRs) belong to the pentameric ligand-gated ion channel superfamily (pLGICs). Among them, the neuronal homomeric α7 nAChR is highly permeable to calcium and plays critical roles in synaptic transmission, cell signaling, and inflammation modulation. The biogenesis of α7 nAChRs is enhanced by the chaperone proteins RIC-3 and NACHO. Previously, we reported a motif in the 5-HT3A receptor, another pLGIC, involved in RIC-3 modulation. Residues in this motif are conserved and also found within the L1-MX segment of the α7 nACh subunit. We therefore explored the regulatory roles of these conserved residues in the biogenesis of α7 nAChRs using multiple approaches, including heterologous expression in Xenopus laevis oocytes, mutagenesis, pull-down assays, cell-surface labeling, and two-electrode voltage-clamp (TEVC) recordings. We find that synthetic α7 L1-MX peptide interacts with both RIC-3 and NACHO. In particular, conserved residues W330, R332, and L336 in the L1-MX segment positively regulate the assembly of α7 oligomers and the biogenesis of α7 nAChR. In the presence of residues W330, R332, and L336, NACHO promotes the assembly of an α7 pentamer that is resistant to strong denaturing conditions. The NACHO-promoted α7 pentamer is also resistant to the Endo H enzyme. Sensitivity of the pentamer to moderate temperatures (37 °C, 45 °C, and 50 °C) suggests that NACHO stabilizes the pentamer via non-covalent interactions. In contrast, Ala replacements at these residues disrupt the biogenesis and abolish α7 current. NACHO and RIC-3 co-expression yields partial rescue of functional expression for some Ala replacement constructs. SUMMARY: This work identifies regulatory roles of conserved residues W330, R332, and L336 in the biogenesis of α7 nAChR. This discovery positions the MX subdomain as a promising target for future drug development that can minimize adverse effects.

4
Teaching Diffusion Models Physics: Reinforcement Learning for Physically Valid Diffusion-Based Docking

Broster, J. H.; Popovic, B.; Kondinskaia, D.; Deane, C. M.; Imrie, F.

2026-03-27 bioinformatics 10.64898/2026.03.25.714128 medRxiv
Top 0.1%
4.1%

Molecular docking aims to predict the binding conformation of a small molecule to its protein target. Recent work has proposed diffusion models for this task, from rigid-body docking that diffuses over ligand degrees of freedom to co-folding approaches that jointly generate protein structure and ligand pose. However, diffusion-based docking models have been shown to frequently produce physically implausible poses and fail to consistently recover key protein-ligand interactions. To address this, we introduce a reinforcement learning framework for training diffusion-based docking models directly on non-differentiable objectives. Fine-tuning DiffDock-Pocket for physical validity with our approach substantially increases the number of generated poses that are physically valid and interaction-preserving, with no increase in inference-time compute. Importantly, this comes without sacrificing structural accuracy; in fact, our approach increases the proportion of structures with near-native poses. These effects are most pronounced for protein targets that are dissimilar to the training data. Our fine-tuned DiffDock-Pocket model outperforms both classical docking algorithms and machine learning-based approaches on the PoseBusters set. Our results demonstrate that reinforcement learning can teach diffusion-based docking models to better respect physical constraints and recover key interactions, without the requirement to rely on inference-time corrections.

5
The concentric beta-barrel hypothesis for amyloids: Models of soluble and transmembrane amyloid-beta 42 oligomers and channels composed of identical subunits and GM1 gangliosides.

Guy, H. R.; Durell, S. R.; Shafrir, Y.

2026-03-23 neuroscience 10.64898/2026.03.19.711324 medRxiv
Top 0.1%
3.9%

Soluble oligomers and transmembrane channels formed by the 42-residue variant of amyloid beta (Aβ42) play key roles in Alzheimer's disease. Unfortunately, detailed structures of these assemblies have not been determined. Our group addresses this problem by developing atomic scale models. Previously we proposed that both soluble Aβ42 oligomers and transmembrane channels have symmetric concentric β-barrel structures. Here we expand this hypothesis to include GM1 gangliosides and sometimes cholesterol and lattice models of channel assemblies. The presence of GM1 gangliosides increases the toxicity of Aβ42, enhances its ability to penetrate liposome membranes, and facilitates interactions between adjacent liposomes. Although the conformations of numerous model assemblies vary, in these models the carboxyl group of GM1 always binds to side-chains of histidine 13 and/or histidine 14. Our soluble oligomer models are consistent with electron microscopy images of beaded annular protofibrils. Our models of membrane-bound assemblies are consistent with the following: freeze-fracture and atomic force microscopy images of Aβ42 in lipid bilayers, secondary structure results, the calcium hypothesis of Alzheimer's disease, effects of lithium depletion on AD, established β-barrel theory, and energetic criteria.

6
Is metabolism spatially optimized? Structural modeling of consecutive enzyme pairs reveals no evidence for spatial optimization of catalytic site proximity.

Algorta, J.; Walther, D.

2026-03-26 bioinformatics 10.64898/2026.03.24.713955 medRxiv
Top 0.2%
3.6%

Metabolic pathways are often hypothesized to benefit from the spatial organization of enzymes, facilitating substrate transfer through mechanisms such as metabolic channeling or metabolon formation. However, it remains unclear whether the spatial proximity of catalytic sites represents a general organizational principle of metabolism or is restricted to specific pathways. Here, we investigate whether consecutive enzymes in metabolic pathways, when physically interacting, exhibit structurally optimized arrangements that minimize distances between their catalytic sites, thereby increasing metabolite transfer efficiency from one enzyme to the next. We first evaluated the ability of current protein-protein interaction prediction methods, including AlphaFold2, AlphaFold3, ESMFold, and HDOCK, to model weak and transient interactions using a benchmark dataset of 112 low-affinity protein dimers from PDBbind. AlphaFold-based approaches performed best in recovering correct interaction geometries, while ESMFold showed limited performance. We further assessed several confidence metrics and identified ipTM, ipSAE, and VoroIF-GNN as the most informative predictors of correct interaction conformations. In addition to simple Euclidean distance metrics, we developed a computational procedure to estimate the shortest accessible-space paths between catalytic sites in predicted enzyme-enzyme complexes. Applying this framework to 107 consecutive enzyme pairs in E. coli revealed an increased tendency for consecutive enzymes to interact, but no systematic evidence that interacting enzymes position their catalytic sites in spatially optimized configurations. In the predicted complex conformations, catalytic sites tend not to be positioned closer than expected at random. The developed computational workflow provides a general framework for analyzing structural aspects of metabolic organization.

7
Comparing Random and Natural RNA Boltzmann Ensembles

Khan, H.; Garcia-Galindo, P.; Ahnert, S. E.; Dingle, K.

2026-04-01 biophysics 10.64898/2026.03.31.715513 medRxiv
Top 0.2%
3.5%

A morphospace is an abstract space of theoretically possible biological traits, shapes, or property values. It is interesting to explore which parts of a morphospace life occupies, as compared to those parts which could be occupied, but are not. Comparing random and natural non-coding (nc) RNA secondary structures is an established approach to studying morphospace occupation for RNA structures. Most earlier studies have focused on the minimum free energy (MFE) structure, while relatively few have looked at the Boltzmann distribution, describing the ensemble of energetically suboptimal RNA folds. These suboptimal structures may have important roles and functions, and hence should be examined carefully. Here we compare random and natural ncRNA in terms of their Boltzmann distributions, finding that natural RNA tend to have very similar profiles to random RNA, with the main difference being that natural RNA are slightly more energetically stable, except for very short sequences (20 to 30 nucleotides) which tend to be slightly less stable. We infer that natural ncRNA occupy similar parts of the morphospace that random RNA do, indicating that the biophysics of the genotype-phenotype map largely determines the ensemble properties of ncRNA.
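
The Boltzmann distribution mentioned in this abstract assigns each candidate structure a probability proportional to exp(-E/RT). The following is a minimal sketch of that calculation only, not the authors' pipeline; the free-energy values and the temperature of 310.15 K are illustrative assumptions.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def boltzmann_probabilities(energies_kcal, temperature_k=310.15):
    """Return the Boltzmann probability of each structure.

    energies_kcal: free energies (kcal/mol) of the structures in the ensemble.
    temperature_k: assumed temperature in kelvin (illustrative default).
    """
    rt = R_KCAL * temperature_k
    e_min = min(energies_kcal)
    # Shift by the minimum energy for numerical stability; the shift cancels in the ratio.
    weights = [math.exp(-(e - e_min) / rt) for e in energies_kcal]
    partition = sum(weights)
    return [w / partition for w in weights]

# Hypothetical ensemble: an MFE structure plus three suboptimal folds.
energies = [-5.0, -4.2, -3.1, 0.0]
probs = boltzmann_probabilities(energies)
for e, p in zip(energies, probs):
    print(f"E = {e:5.1f} kcal/mol  p = {p:.3f}")
```

The lowest-energy (MFE) structure receives the largest weight, but suboptimal folds within a few RT of it can still hold a substantial share of the ensemble, which is why ensemble-level comparisons can differ from MFE-only ones.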

8
IDPForge: Deep Learning of Proteins with Global and Local Regions of Disorder

De Castro, S.; Zhang, O.; Liu, Z. H.; Forman-Kay, J. D.; Head-Gordon, T.

2026-03-27 biophysics 10.64898/2026.03.25.714313 medRxiv
Top 0.2%
2.9%

Although machine learning has transformed protein structure prediction of folded protein ground states with remarkable accuracy, intrinsically disordered proteins and regions (IDPs/IDRs) are defined by diverse and dynamical structural ensembles that are predicted with low confidence by algorithms such as AlphaFold and RoseTTAFold. We present a new machine learning method, IDPForge (Intrinsically Disordered Protein, FOlded and disordered Region GEnerator), that exploits a transformer protein language diffusion model to create all-atom IDP ensembles and IDR disordered ensembles that maintain the folded domains. IDPForge does not require sequence-specific training, back transformations from coarse-grained representations, or ensemble reweighting, as in general the created IDP/IDR conformational ensembles show good agreement with solution experimental data, and options for biasing with experimental restraints are provided if desired. We envision that IDPForge with these diverse capabilities will facilitate integrative and structural studies for proteins that contain intrinsic disorder, and is available as an open source resource for general use.

9
Structural analyses of Trichomonas vaginalis pyrophosphate-dependent phosphofructokinase (TvPPi-PFK)

Chiu, A.; Liu, L.; Seibold, S.; Battaile, K.; Craig, J.; Harmon, E.; Subramanian, S.; Chakafana, G.; Early, J.; Cron, L.; Staker, B.; Myler, P. J.; Lovell, S. J.; Van Voorhis, W.; Asojo, O.

2026-03-28 biochemistry 10.64898/2026.03.28.715000 medRxiv
Top 0.3%
2.4%

Trichomonas vaginalis causes trichomoniasis, the most common non-viral sexually transmitted disease in humans. T. vaginalis pyrophosphate-dependent phosphofructokinase (TvPPi-PFK) is a putative target for rational, structure-based drug discovery, given its absence in mammals and its importance for parasite survival. TvPPi-PFK is a cytosolic enzyme that catalyzes the phosphorylation of fructose-6-phosphate using pyrophosphate (PPi) as the phosphoryl donor. This reversible reaction, catalyzed by TvPPi-PFK, is the first committed step in glycolysis. Its reverse reaction is vital for gluconeogenesis in T. vaginalis. The purification, crystallization, structure determination, and preliminary structure-functional analyses of three crystal structures of TvPPi-PFK are presented. All three structures organize as tetramers with the conserved motifs essential for pyrophosphate binding and PPi-PFK catalytic activity. Comparative analysis with structural neighbors from other organisms demonstrated that despite sharing <29% sequence identity, the TvPPi-PFK protomer shares overall topology with both PPi- and ATP-dependent PFKs. Mass photometry confirmed that TvPPi-PFK formed tetramers under near-physiological conditions. Unexpectedly, TvPPi-PFK crystals dephosphorylate ATP to AMP during soaking. In all three structures, either ATP or AMP is bound at the enzyme's dimer interface, typical of ATP-PFKs, but a novel finding for PPi-PFKs. Furthermore, a sugar phosphate binding site was observed in proximity to the ATP-binding site. Thus, the three reported TvPPi-PFK structures validate its established PPi-dependent activity while revealing previously unreported ATP and sugar phosphate binding. This study also lays a foundation for future research into putative ATP-dependent activity of TvPPi-PFK and for evaluating known phosphofructokinase inhibitors as potential therapeutics for trichomoniasis.
These findings expand our understanding of PFK superfamily diversity and support the continued exploration of TvPPi-PFK as a drug target for trichomoniasis. Synopsis: The production, crystallization, and three crystal structures of a pyrophosphate-dependent phosphofructokinase from Trichomonas vaginalis (TvPPi-PFK) reveal ATP binding and structural similarity to both ATP-dependent and pyrophosphate-dependent phosphofructokinases. TvPPi-PFK dephosphorylates ATP and has a novel ATP-PFK-like ATP-binding cavity.

10
c-di-AMP inactivates a K+/H+ antiporter in Bacillus subtilis

Figueiredo-Costa, I. R.; Lorga-Gomes, M. M.; Sousa-Moreira, S. C.; Matas, I. M.; Morais-Cabral, J. H.

2026-03-25 biochemistry 10.64898/2026.03.23.713699 medRxiv
Top 0.3%
2.3%

c-di-AMP is a bacterial second messenger with the crucial role of regulating turgor and osmotic adaptation. Due to the importance of intracellular K+ for osmotic balance, c-di-AMP controls the import and export of K+ by regulating the activity and transcription level of K+ transporters and channels. It has been postulated that c-di-AMP inactivates K+ import and activates K+ export. To gain a full understanding of the properties of the K+ machinery in the Gram-positive model organism Bacillus subtilis and, in particular, of how the machinery is regulated by c-di-AMP, we characterized the molecular properties of CpaA, a cation/H+ antiporter that has been shown to bind the dinucleotide. We determined the crystal structure of the cytosolic RCK domain with and without c-di-AMP and performed a functional characterization of full-length CpaA using a fluorescence-based flux assay. We found that c-di-AMP binds on the interface of the RCK-C subdomain but only small structural differences are detected between the apo- and holo-structure. We determined that CpaA is more active at high pH and that it slightly favors K+ over Na+ for exchange with H+. Unexpectedly, CpaA is inactivated by c-di-AMP with a K1/2 of inactivation around 1 µM. Our results reinforce the emerging view that regulation of the bacterial K+ machinery by c-di-AMP is more complex than previously thought and that a detailed characterization of the molecular properties of the individual protein components and of how their activity is integrated is necessary for a complete view of the machinery's physiological function.

11
Both ATP and Mg2+ are Required for High-Affinity Binding of Indolmycin to Human Mitochondrial Tryptophanyl-tRNA Synthetase

Carter, C. W.

2026-03-25 biophysics 10.64898/2026.03.23.713518 medRxiv
Top 0.3%
2.1%

Eukaryotes have distinct nuclear genes for tryptophanyl-tRNA synthetase (TrpRS). Human mitochondrial (Hmt) TrpRS (also WARS2) shares only 14% sequence identity with human cytoplasmic (Hc)TrpRS, but 41% with Bacillus stearothermophilus (Bs)TrpRS. Tryptophan binding to BsTrpRS is largely promoted by hydrophobic interactions and recognition of the indole nitrogen by side chains of Met129 and Asp132. The non-reactive analog indolmycin can recruit unique polar interactions to form an active-site metal coordination that lies off the normal mechanistic path, enhancing affinity to BsTrpRS and other prokaryotic TrpRS enzymes by 1500-fold over its tryptophan substrate. By contrast, human WARS2 complements nonpolar interactions for tryptophan binding with additional electrostatic and hydrogen bonding interactions that are inconsistent with indolmycin binding. We report here a 1.82 Å crystal structure of an HmtTrpRS*indolmycin*Mn2+*ATP complex, showing that mitochondrial and bacterial enzymes use similar determinants to bind both ATP and indolmycin. ATP forms tight electrostatic interactions between the catalytic metal ion and a non-bridging oxygen atom from each phosphate group. Hydrogen bonds between the oxazolinone group and active-site residues create an off-path ground-state configuration. This arrangement closely mimics that in the corresponding BsTrpRS complex but varies greatly from ATP binding to HcTrpRS. Moreover, isothermal titration calorimetry demonstrates that, as for BsTrpRS, Mg2+*ATP, but not ATP alone, enhances indolmycin binding affinity ~100-fold with a supplemental Δ(ΔG) of about -3 kcal/mol. Structural, thermodynamic, and kinetic similarities confirm our previous conclusion that a reinforced ground-state Mg2+ ion configuration contributes to the high indolmycin affinity in the bacterial system.
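
The two figures quoted in this abstract, a roughly 100-fold affinity gain and a Δ(ΔG) of about -3 kcal/mol, are linked by the standard relation Δ(ΔG) = -RT ln(fold change). The sketch below only checks that arithmetic; the temperature of 298.15 K is an assumption, since the abstract does not state the experimental temperature.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def delta_delta_g(fold_change, temperature_k=298.15):
    """Free-energy change (kcal/mol) equivalent to a fold_change gain in binding affinity.

    temperature_k is an assumed value, not taken from the preprint.
    """
    return -R_KCAL * temperature_k * math.log(fold_change)

ddg = delta_delta_g(100.0)
print(f"100-fold tighter binding corresponds to {ddg:.2f} kcal/mol")
```

At this assumed temperature a 100-fold enhancement works out to a little under -3 kcal/mol, consistent with the approximate value reported.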

12
Residue burial encodes a protein's fold

Grigas, A. T.; Sumner, J.; O'Hern, C. S.

2026-03-31 biophysics 10.64898/2026.03.28.714986 medRxiv
Top 0.3%
2.1%

Protein structure is controlled by a high-dimensional energy landscape, which is a function of all of the atomic coordinates of the protein. Can this landscape be accurately described by a low-dimensional representation? We find that residue core identity, a binary N-dimensional encoding indicating whether each of the N amino acids in a protein is buried in the core or not, can predict the protein's backbone conformation more efficiently than all other representations that we tested. Core identity is 4 times more efficient than previous estimates of the bits per residue needed to encode a protein's native fold, 2 times more efficient than the Cα contact map, and 1.5 times more efficient than the machine-learned embeddings from FoldSeek's 3Di. Even when the folded structure is unavailable, predicting each residue's burial from sequence yields a more accurate estimate of fold quality than predicting pairwise contacts from the same sequence information. Thus, this work emphasizes that the problem of determining a protein's native fold can be re-framed as predicting each residue's core identity.

13
Structural Mechanism of TRPC3 Channel Activation by the Moonwalker Mutation

Zang, J.; Tan, Y.; Chen, Y.; Guo, W.; Zhao, X.; Peng, H.; Chen, L.

2026-04-06 biophysics 10.64898/2026.04.03.716262 medRxiv
Top 0.3%
2.1%

TRPC3 is a calcium-permeable, non-selective cation channel that is activated by DAG. It is expressed in several tissues, especially in the cerebellum, and has been implicated in various human diseases. Despite recent progress in understanding the structural mechanism of TRPC3, how the channel opens remains elusive. Here, we present structures of hTRPC3 in an agonist-free resting state, determined using a DAG-binding site mutant. We also present the structure of hTRPC3 in a DAG-bound open state, determined using a constitutively active "moonwalker" (T561A) mutant. These structures, together with electrophysiological results, reveal that the T561A mutation activates hTRPC3 by disrupting a polar interaction with N652. A newly formed π-bulge in S6 leads to rotation and outward tilting of the lower half of S6, resulting in dilation of the pore and thus channel opening. Agonist DAG stabilizes hTRPC3 in the open conformation. BTDM exerts its inhibitory effect by pushing S5 and S6 back to the center to close the pore, while preserving the π-bulge. These results shed light on the opening mechanism of hTRPC3.

14
Strategic template filtering accelerates fragment-based peptide docking

Trabelsi, N.; Varga, J. K.; Khramushin, A.; Lyskov, S.; Schueler-Furman, O.

2026-03-30 bioinformatics 10.64898/2026.03.26.714397 medRxiv
Top 0.4%
2.1%

Peptide-protein interactions are often transient and structurally elusive, necessitating computational approaches to identify both binding sites and peptide conformations. PatchMAN, one of the leading but computationally expensive biophysics-based global peptide-docking protocols, addresses this challenge by treating peptide docking as a protein-folding problem, using structural motifs from solved structures as templates that are subsequently refined using Rosetta FlexPepDock. Here we present PatchMAN2, which introduces 1) strategic fragment filtering and 2) local docking modes that focus sampling on relevant surfaces or known binding regions, thereby reducing the high computational cost that the original implementation incurs through extensive refinement of many non-productive, low-quality fragments. Benchmarking shows that PatchMAN2 removes ~30-70% of unnecessary fragments while preserving accuracy, substantially reducing runtime and improving the practical efficiency of peptide-protein docking.

15
IDBSpred: An intrinsically disordered binding site predictor using machine learning and protein language model

Jones, D.; Wu, Y.

2026-03-30 bioinformatics 10.64898/2026.03.27.714773 medRxiv
Top 0.4%
1.8%

Intrinsically disordered proteins (IDPs) mediate many cellular functions through interactions with structured protein partners, but predicting the corresponding binding sites on the structured partner remains challenging. Here, we present IDBSpred, a sequence-based method for residue-level prediction of IDP-binding sites on structured proteins. Training and test data were collected from the DIBS database, which contains more than 700 non-redundant IDP-protein complexes. Residue-level embeddings of structured partner sequences were generated using the ESM-2 protein language model and used as input to a multilayer perceptron classifier for binary prediction of binding versus non-binding residues. Analysis of amino acid composition showed that IDP-binding sites are enriched in aromatic residues, especially Trp, Tyr, and Phe, as well as several charged and polar residues, whereas Ala and several small or conformationally restrictive residues are depleted. The classifier achieved an ROC AUC of 0.87 and an average precision of 0.61. Structural case studies further showed that the predicted sites largely recapitulate the major experimentally defined binding interfaces. These results demonstrate that protein language model embeddings plus machine learning algorithms can effectively capture sequence features associated with IDP recognition on structured proteins. IDBSpred provides a practical framework for studying IDP-mediated interfaces and identifying potential therapeutic hotspots.

16
Hydration and hydrolysis define antibiotic resistance conferred by macrolide esterases

Kelly, E. T. R.; Myziuk, I.; Hemmings, M. Z.; Mulla, Z.; Blanchet, J.; Ruzzini, A.; Berghuis, A. M.

2026-03-25 biochemistry 10.64898/2026.03.24.713787 medRxiv
Top 0.4%
1.8%

Macrolides are an antibiotic class widely used in both human and veterinary medicine, and function by interfering with protein synthesis. Regrettably, numerous strategies for evading the antibiotic properties of macrolides have been found in bacteria, including enzyme-mediated inactivation. These mechanisms are now widely disseminated among pathogenic, animal-associated and environmental bacteria, making them a One Health issue. Macrolide esterases, which hydrolyze the macrolactone's ester bond, confer one such resistance mechanism. Two types of macrolide esterases have thus far been identified, the well-studied erythromycin esterases and the recently discovered Est-type enzymes that belong to the α/β-hydrolase superfamily. We present detailed structure-function studies for four diverse Est-type esterases, which share only 44-66% sequence identity (EstTSf, EstTSt, EstTBc, and EstXEc). In addition to resistance profiling and substrate specificity studies, we present structures for all four enzymes, including structures for EstTBc and EstXEc in complex with tylosin and tylvalosin macrolides, post hydrolysis. Complementing the data with mutational and kinetic studies allowed for a detailed analysis of the structural basis for macrolide-enzyme interactions. Combined, the data suggest that promiscuous binding and imprecise positioning, mediated by a water-cage, dictate substrate specificity for Est-type macrolide resistance enzymes. These insights may prove beneficial for next-generation antibiotic development.

17
Decoupling Topology from Geometry: Detecting Large-Scale Conformational Changes via Conformational Scanning

Lin, R.; Ahnert, S. E.

2026-03-31 bioinformatics 10.64898/2026.03.28.714756 medRxiv
Top 0.4%
1.7%

Protein function is fundamentally driven by structural dynamics, yet the majority of structural bioinformatics treats proteins as static rigid bodies. While Molecular Dynamics (MD) simulations attempt to capture these motions, they are computationally prohibitive for exploring large-scale conformational changes, such as domain movements or allostery, which occur on timescales often inaccessible to standard simulation. However, the Protein Data Bank (PDB) contains a latent wealth of dynamic information in the form of redundant entries: proteins solved in multiple distinct conformational states. Detecting these "shape-shifting" pairs remains challenging because standard structural alignment algorithms (e.g., TM-align) rely on rigid-body superposition, which fails when substantial geometric rearrangement occurs. In this study, we introduce a high-throughput method to systematically mine the PDB for proteins that share identical topology but exhibit divergent tertiary conformations. By utilizing a coarse-grained Secondary Structure Element (SSE) representation, we decouple topological connectivity from geometric rigidity, allowing for the detection of conformational homologues that share low global structural similarity despite high predicted structural similarity. We applied this "conformational scanning" across the entire RCSB database, identifying a curated dataset of proteins undergoing significant structural rearrangements. This work bridges the gap between static structural data and dynamic function, providing a critical "ground truth" dataset for benchmarking data-driven protein design and checking the plausibility of generative structure models.

18
Characterizing the endopeptidase activity of Candida albicans Gpi8, a crucial subunit of the GPI transamidase

Cherian, I.; Shefali, S.; Maurya, D. S.; Khan, F. M.; Komath, S. S.

2026-04-09 biochemistry 10.64898/2026.04.07.717003 medRxiv
Top 0.4%
1.7%

GPI-anchored proteins are crucial cell surface proteins with diverse, organism-specific functions in eukaryotes. They are produced when the GPI transamidase (GPIT), a five-subunit membrane-bound enzyme complex, attaches a pre-formed GPI anchor to the C-terminal end of nascent proteins on the lumenal face of the endoplasmic reticulum. This process requires the removal of a C-terminal signal sequence (SS) on the substrate protein by the action of an endopeptidase subunit of the GPIT, Gpi8/PIG-K. Using an AMC-tagged peptide in a cell-free (post-mitochondrial fraction) assay, this manuscript studies the steady state kinetics of enzymatic cleavage of the substrate by GPIT of the human pathogenic fungus, C. albicans. We show that Mn2+ enhances activity by improving substrate binding but plays no direct role in substrate cleavage per se. Molecular dynamics simulations suggest that the divalent cation binds at a site away from the active site but provides compactness and stability to Gpi8. It also enables a conformation in which a flexible loop (residues 219-244) in the vicinity of the catalytic pocket is able to interact with and position the scissile bond for cleavage by Cys202. Steady state kinetics also indicate that peptides of lengths 7-mer to 9-mer are better bound than 4-mer or 15-mer peptide substrates. A bulky residue at the site of cleavage reduces the catalytic activity of the GPIT. This is the first detailed steady state kinetics study on the endopeptidase activity of a GPIT from any organism.

19
Molecular basis of protein-glycan cross-linking by CpCBM92A revealed by NMR spectroscopy

Trooyen, S. H.; Ruoff, M. S.; McKee, L. S.; Courtade, G.

2026-04-10 biophysics 10.64898/2026.04.08.717144 medRxiv
Top 0.5%
1.7%

Our current understanding of carbohydrate-binding module (CBM) function is limited by the fact that most CBM research has focused on single-binding-site modules. CBM family 92 (CBM92) is a recently characterized family of predominantly trivalent proteins that bind β-1,3- and β-1,6-glucans with high specificity. CpCBM92A from Chitinophaga pinensis stands out as the first trivalent member of the family to be structurally determined. Multivalent CBM families are rare, and the way in which the three binding sites cooperate in ligand recognition remains unclear. Here, we use NMR spectroscopy to demonstrate how each of the protein's binding sites plays a distinct role in ligand binding. One binding site, referred to as the β site, can be identified as the primary attachment point because of its higher affinity for all tested ligands, consistent with previous biochemical data suggesting it is the strongest binding site on CpCBM92A. The other two binding sites, referred to as α and γ, preferentially bind longer segments of β-1,3- and β-1,6-glucan chains, respectively. We further show that the glycosidic bond position and anomeric configuration of the binding glucosyl unit strongly affect protein affinity due to a preferred ligand pose in the binding sites. Our results provide insight into how the trivalent architecture of CBM92 might enable cross-linking of scleroglucan chains, which may guide the development of new applications for CBMs in biotechnology.

20
emb2dis: a novel protein disorder prediction tool based on ResNets, dilated convolutions & protein language models

Duarte, S. A.; Mehdiabadi, M.; Bugnon, L. A.; Aspromonte, M. C.; Piovesan, D.; Milone, D. H.; Tosatto, S.; Stegmayer, G.

2026-04-01 bioinformatics 10.64898/2026.03.30.715414 medRxiv
Top 0.5%
1.7%

Intrinsically disordered proteins (IDPs) play an important role in a wide range of biological functions and are linked to several diseases. Due to technical difficulties and the high cost of experimental determination of disorder in proteins, combined with the exponential increase of unannotated protein sequences, the development of computational methods for disorder prediction became an active area of research in the last few decades. In this work, we present emb2dis, a deep learning model that uses protein language models (pLMs) to predict disorder from sequence. The emb2dis tool is a pre-trained model that receives as input a protein sequence, calculates its pLM embedding and passes it to a deep learning model. In contrast to existing approaches, emb2dis integrates informative sequence representations with a novel architecture that combines residual networks (ResNets) and dilated convolutions. This design effectively enlarges the receptive field of the convolution operation, enabling the model to better capture an extended context of each amino acid. At the output, emb2dis assigns a disorder propensity score to each residue in the sequence. The model was evaluated on datasets from the latest CAID3 blind benchmark for disorder prediction, where it achieved first place in the Disorder-PDB category, exhibiting strong performance with high AUC and Fmax scores. Additionally, it ranked among the top ten methods on the Disorder-NOX dataset. We provide a freely available web-demo for emb2dis and a source code repository for local installation. Web link for the tool: https://sinc.unl.edu.ar/web-demo/emb2dis/ The importance of the emb2dis tool is that it provides a new deep learning approach and significant improvements in the prediction of protein disorder, with a simple web interface and graphical output detailing per-residue disorder.
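
The claim that dilated convolutions enlarge the receptive field can be illustrated with the standard formula for stacked stride-1 1D convolutions: each layer with kernel size k and dilation d adds (k - 1) * d positions. The sketch below is generic arithmetic; the layer count, kernel size, and dilation rates are illustrative assumptions and are not the emb2dis architecture.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field (in residues) of stacked stride-1 1D convolutions."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        # Each layer widens the context by (kernel size - 1) * dilation positions.
        rf += (k - 1) * d
    return rf

# Hypothetical stacks of four layers with kernel size 3.
plain = receptive_field([3, 3, 3, 3], [1, 1, 1, 1])    # no dilation
dilated = receptive_field([3, 3, 3, 3], [1, 2, 4, 8])  # exponentially growing dilation
print(plain, dilated)
```

With the same number of layers and parameters, the dilated stack in this example sees 31 residues of context per output position versus 9 for the undilated one.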